IBIS Macromodel Task Group

Meeting date: 09 December 2014

Members (asterisk for those attending):
Altera:                       David Banas
ANSYS:                      * Dan Dvorscak
                            * Curtis Clark
Avago (LSI):                  Xingdong Dai
Cadence Design Systems:     * Ambrish Varma
                              Brad Brim
                              Kumar Keshavan
                              Ken Willis
Ericsson:                     Anders Ekholm
IBM:                          Steve Parker
Intel:                      * Michael Mirmak
Keysight Technologies:        Fangyi Rao
                            * Radek Biernacki
Maxim Integrated Products:    Hassan Rafat
Mentor Graphics:            * John Angulo
                            * Arpad Muranyi
Micron Technology:          * Randy Wolff
                              Justin Butterfield
QLogic Corp.:                 James Zhou
                              Andy Joy
eASIC:                        Marc Kowalski
SiSoft:                     * Walter Katz
                            * Todd Westerhoff
                            * Mike LaBonte
Synopsys:                     Rita Horner
Teraspeed Consulting Group:   Scott McMorrow
Teraspeed Labs:             * Bob Ross

(Note: Agilent has changed to Keysight)

The meeting was led by Arpad Muranyi.

------------------------------------------------------------------------
Opens:

- Arpad reviewed our upcoming meeting schedule.
  - This is our last meeting of the year.
    [no objections]

--------------------------
Call for patent disclosure:

- None.

-------------
Review of ARs:

- Walter to send the updated C_comp Model BIRD draft to Mike for posting.
  - Done.
- Todd to produce slides for the co-optimization requirements discussion.
  - Presenting today.
- Arpad to review the IBIS spec for min/max issues.
  - In progress.

-------------
New Discussion:

Co-optimization:
- Arpad - Motion to untable the Back-channel discussion.
- Todd - Second.
  [no one opposed]
- Todd showed the "IBIS-AMI and Co-optimization" presentation.
- Todd - Today we want to talk about:
  - What do we understand the requirements to be?
  - What is the problem to solve?
  - What are the required elements of the solution?
- Todd - We will gloss over many details.
  - Lay it out from requirements down to a high-level flow.
  - Changes to .ami files and .dlls.
  - More important to make sure we all understand what we are presenting.
  - Happy to take questions along the way.
- Todd - I went to Wikipedia to see if the words we use are defined.
  - Optimization - you have a numerical metric you can maximize or minimize.
  - Co-optimization, link training, back channel - all comical definitions.
  - These three terms don't have a standard definition we need to follow.
- Todd - What is the problem we are trying to solve?
  - What are the requirements for a potential solution?
  - Simulation models for devices that have a hardware runtime back channel.
  - Co-optimize the Tx and Rx at runtime.
  - Cadence's original proposal.
  - The implication here is that the models try to follow the hardware
    optimization protocol as closely as possible.
  - We recognize that whenever we optimize a system, each decision affects
    the subsequent decisions (search path) and the final landing point.
  - We may have local minima and maxima.
  - We get a locally optimal solution because we are trying to model the
    exact search path; the search path matters.
  - Designer's questions:
    - Will the link converge?
    - Do I need to pick presets?  Which one?
    - What are the final eye margins?
- Todd - Scenario #1 requirements:
  - Emulate the hardware training protocol as literally as possible.
    - Models must communicate at simulation runtime to do this.
    - This is what Cadence proposed to begin with.
  - What are our operating margins after training?
  - Want cross-vendor (model maker) support.  Interoperability.
  - Want to report the optimized IP settings (taps, etc.).
    - So we can correlate with lab measurements and verify.
  - Want a constrained optimization based on actual IP capabilities.
    - Want it based on actual tap granularities, for example.
  - Some semiconductor vendors have said they want:
    - GetWave() based "literal" reproduction of what the hardware does.
    - Also want the option of trying to approximate it with Init().
    - Want support for a hardware starting point (hardware presets).
    - Want probes to work correctly.
      - If we probe a Tx output or somewhere in the channel, we see the
        right behavior.
- Todd - Scenario #2 requirements:
  - Seem almost exactly like #1 from a modeling standpoint.
    - May be the source of a lot of the confusion.
  - Closely related to #1.
    - Why SiSoft has been adamant about one spec that covers both.
  - Trying to optimize settings for hardware that does not communicate at
    hardware runtime (no back channel).
  - From a simulation standpoint, we want our models to act that way anyway.
    - Means that the models are doing something beyond what the hardware
      does.
    - Means we're thinking about things that are search path independent.
    - Assumes the optimization is path independent, a global optimum.
  - The design question is:
    - If we can program the Tx and Rx, how should we program them to
      optimize the system's performance?
  - We are all familiar with Rx models that do optimization already.
    - DFE, CTLE, VGA, Super loops.
    - This is beyond that.  It's an Rx model that knows how to talk to
      some Tx.
  - Designer's questions are similar (to #1):
    - Can this link work with this IP?
    - How should I set them up?
    - What are the margins once I'm done?
  - Scenario #1 and #2 are almost identical for simulation methodology.
  - The problem to be solved is slightly different:
    - Scenario #1 - Predict what the hardware will do when it trains itself.
    - Scenario #2 - Program the hardware up front and let the Rx adjust
      itself.
- Michael Mirmak - I'm interpreting you to mean:
  - Scenario #1 - The objective is to duplicate the algorithm that is in
    the silicon.
    - Therefore, the implication is that we should defer to the model,
      since it should be written to represent the actual silicon.
    - Getting the Tx and Rx to communicate is then the issue.
  - Scenario #2 - The system simulation person might want to use their own
    optimization technique.  The tool will have to get more involved.
  - Is that a fair summary?
- Todd - The million dollar question for later is, "Who does the
  optimizing?"
  - We will propose that the receiver does it.
  - Some of my customers have systems with thousands of links in them.
    - Systems where Txs don't talk to Rxs.
    - Even if you put the Rx in auto mode, you need to know how to program
      the Tx.
- Arpad - A variation on what Mike said is:
  - Not only does the user think they can do a better job with the
    optimization, the silicon itself might not do any.
- Todd - [continuing to the next slide]
  - Why Scenario #2?
    - A starting point for lab validation.  Figure out how to program the
      Txs.
    - Need settings on a per-link basis.
      - Want to optimize them.
    - Currently people often bin channels according to length (loss).
      - Not good enough if you want to get every last bit of margin.
      - Some customers want this per-link optimization.
    - Enabling AMI models to communicate and cross-optimize in the general
      case, even when the hardware doesn't do it.
  - Does that make sense?
- Arpad - Yes, that is what my comment was stating.
- Todd - Scenario #2 requirements:
  - Similar but a bit different.
  - Want to know the optimized settings and margins.
    - Particularly want to know the optimized settings.
    - Need to set up the hardware.  Txs won't set themselves.
    - Want to be able to extract optimized Tx settings.
  - Want cross-vendor interoperability.
  - Need a way to prove the results are valid.
    - We use this optimization and get some settings for the Tx and Rx.
    - A subsequent simulation run with the same settings programmed directly
      should give the same result.
  - High throughput is important.
    - Optimize 4000 links in an overnight run as a goal.
  - Fully constrain the optimized solution based on actual IP settings.
  - Probes work correctly.
  - We want user-selectable optimization criteria.
    - Since we're not duplicating a specific hardware algorithm in this
      case, we could have multiple options.
    - Might want to use different metrics for optimization.
- MM - On that fifth point ["Constrain solution based on IP capabilities"]:
  - I think we need to be extremely cautious if we say that without getting
    into the difference between "capabilities" and "algorithm."
  - It can make Scenario #2 sound like it's Scenario #1.
  - You could optimize according to the IP capabilities, but use an entirely
    different optimization algorithm from what the silicon has.
  - You could get a properly constrained optimization, but an optimization
    done with an entirely different algorithm, so you don't get the right
    results.
  - Constrain the answer, but not necessarily guarantee the same answer.
  - Many people don't see the distinction.
- Walter - My 2 cents.  Two questions to be answered:
  - 1. Does the model really represent the silicon?  Is it right?
  - 2. Even if the model is right, if the search algorithm in your software
    is not very good, you may not find the same minimum the hardware would
    find.
- MM - Excellent, yes.
- Walter - The constrained solution is based on the limits of each of the
  taps.
  - But it's not talking about the algorithm that searches the Tx space.
- MM - As long as that's abundantly clear.
- Walter - Yes, it has to be abundantly clear.
  - Hopefully we will make it clear in this presentation.
- Todd - [moving on] Introduce some terminology:
  - Adaptation - Any behavior in an AMI model that changes on a bit-by-bit
    basis (GetWave()).
  - Eye Quality Metric (EQM) - A numeric measure of eye quality.
    - This is what gets optimized.
  - Self Optimization - Adjusting internal behavior to optimize the EQM.
    - Mainly with Rx models.
    - Assumes the Rx model has some kind of EQM that it computes and uses
      to adapt.
  - Co-Optimization - Simultaneously adjusting the Tx and Rx by any method.
  - Co-Optimization by Proxy - The Rx model does the Tx optimization in
    place of the Tx.
    - A new concept we'll discuss here.
  - Key questions we try to answer:
    - What's being optimized?
    - By whom?
    - Local or global?
- Todd - [showing flow slides]
  - Basic AMI flow - no optimization, models just running.
  - Self Optimization flow:
    - Implies an additional logical block in the Rx .dll that can monitor
      the waveform, clocks, etc., compute an EQM, and come up with new
      settings.
    - A little control loop in the Rx model itself.
    - Shown in the flow as two distinct blocks even though it's in one .dll.
  - Co-Optimization by Proxy - A special case of Co-Optimization using a
    matched set.
    - Tx model equalization is disabled.
    - The Rx actually provides the Tx equalization in place of the Tx.
    - The Rx can ultimately report back its optimized settings for the Tx.
      - The user can plug them back into the Tx and run the analysis.
    - Pro - Requires no change in the current AMI flow.
      - Current simulators could do it.
    - Con - Only works for paired models from the same vendor.
      - The Rx knows a lot about the details of the Tx.
    - The EQM optimization function in the .dll can adjust the Rx and the
      proxy Tx.
      - All three are in the Rx .dll.
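  [Illustrative sketch, not from the presentation: one way a proxy Rx model
  could apply candidate Tx equalization internally.  Because the Tx
  equalization is assumed to be LTI, the Rx model can convolve the impulse
  response it receives in AMI_Init() with a candidate Tx FIR, score the
  result with its internal EQM, and repeat.  The function name, tap layout,
  and UI spacing below are hypothetical.]

      /* Apply a candidate Tx FIR (taps spaced one UI apart, e.g. main and
       * post-cursor) to the impulse response seen at the Rx input.  The
       * array is modified in place; iterating from the end keeps earlier
       * samples intact until they have been used. */
      static void apply_proxy_tx_fir(double *impulse, long num_samples,
                                     long samples_per_ui,
                                     const double *taps, int num_taps)
      {
          for (long i = num_samples - 1; i >= 0; i--) {
              double acc = 0.0;
              for (int t = 0; t < num_taps; t++) {
                  long j = i - (long)t * samples_per_ui;
                  if (j >= 0)
                      acc += taps[t] * impulse[j];
              }
              impulse[i] = acc;
          }
      }

  After scoring each candidate with its EQM, the Rx model would report the
  winning tap values in its output parameters so the user can program the
  real Tx - the "report back its optimized settings" step noted above.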
- Arpad - How could this proxy Tx be used effectively if its output doesn't
  go through the channel?
- Todd - The channel does come into it.
  - Good point; the assumption is that the Tx's equalization is LTI.
  - So it doesn't matter where we apply it.
- Arpad - In an LTI system the order between the Tx and the channel doesn't
  matter.
- Todd - Yes.
- Arpad - Would the Tx side of the diagram have to be modified?
  - The tool would have to stimulate the channel directly, not through the
    Tx.
- Todd - The Tx has analog behavior we need to capture (bandwidth limit,
  reflections, etc.).
  - The Tx also has algorithmic behavior.
- Walter - Arpad, note the "no EQ" next to the Tx block.
  - The Tx has no equalization.  All Tx equalization is done in the Rx
    proxy.
- Arpad - Okay, I missed that ["no EQ"].
- John - There could be jitter associated with the Tx.
  - That would throw a monkey wrench into this [LTI assumption].
- Todd - That's true.
  - This is not perfect.
  - I'm just observing that some people have used this with success.
- Arpad - What is the motivation?  Why would someone want to do this?
- Walter - Without a back-channel or link optimization BIRD, it's the only
  way.
  - The only way to have an Rx optimize the Tx.
- Todd - Alternatively, let's go back to the stock AMI slide.
  - Tx, Rx with auto mode (adaptive CTLE, DFE).
  - The million dollar question - how do I set up my transmitter?
  - If I look at all the combinations of Tx taps at minimum resolution,
    there could be thousands.
    - Probably not practical.
  - Find methods with coarser studies, or go with a Tx/Rx pair and the
    proxy solution.
- Todd - Co-Optimization Analysis Flow:
  - Much of this is very similar to what Cadence already observed.
  - In hardware, link training (co-optimization) happens before the system
    runs.
    - When training is done, normal system operation follows.
  - We want simulation-based co-optimization to do the same thing.
  - Today - IBIS 6.0:
    - Network characterization phase (impulse response).
    - Followed by channel simulation.
  - Co-optimization (proposed):
    - Network characterization as we always did it.
    - Co-optimization phase (Init() and GetWave()).
    - Go into channel simulation, but don't rerun Init().
- Todd - [Co-Optimization block diagram slide]
  - Trying to show the functional blocks, regardless of how they'd be
    packaged.
  - Tx Exploration Algorithm in the Rx model.
    - That algorithm has to be able to send a message back to the Tx.
  - TxConfigurator block.
    - A level of indirection.
    - Accepts messages from the Exploration Algorithm and translates them
      for the Tx.
    - May be a trivial pass-through, or may need to map them.
      - Walter has discussed tap coefficients vs. increments, for example.
  - This is all gathered in the Tx and Rx .dlls.
  - What are we optimizing?
    - EQM - a quantitative metric internal to the Rx model.
    - Need not be defined to the outside world at all.
    - But it does drive the Rx model's optimization.
  - Who's doing the optimizing?
    - In our view, the Rx model.
  - What algorithm is getting used?
    - In Scenario #1, following the exact protocol the hardware follows.
      - Quite possibly a local optimum.
    - In Scenario #2, we would have an algorithm.
      - Expect the optimum to be more global.
  - What training modes do we need to support?
    - Scenario #1, requirement #1.
      - Bit-by-bit hardware simulation (GetWave()).
    - Scenario #2, requirement #4 (high throughput).
      - We think it requires Init() based optimization.
- Todd - Diagram Flows:
  - Time Domain Link Training.
    - The flow we've discussed many times.
    - Tx and Rx GetWave() run in a training mode.
  - Statistical (Init() based) training.
    - Need a training mode in Init().
    - But Init() is currently not re-entrant.
      - It does initialization, memory allocation, impulse processing.
    - Our proposal is a new function:
      - AMI_Impulse() - like the GetWave() version of Init().
      - Does not do the initialization, memory allocation, etc.
      - Call Init() once, then call AMI_Impulse() as often as needed.
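  [Hypothetical sketch, not from the presentation: the proposed
  AMI_Impulse() entry point was described only in concept, and no prototype
  was given.  The argument list below is a guess patterned on the existing
  IBIS-AMI AMI_Init() and AMI_GetWave() signatures; every parameter shown is
  an assumption.]

      /* Re-entrant impulse-response processing: AMI_Init() is called once
       * to allocate memory and do one-time setup, after which the tool may
       * call AMI_Impulse() repeatedly with updated settings during the
       * co-optimization phase. */
      long AMI_Impulse(double  *impulse_matrix,     /* modified in place      */
                       long     row_size,           /* samples per column     */
                       long     aggressors,         /* aggressor columns      */
                       char    *AMI_parameters_in,  /* updated settings       */
                       char   **AMI_parameters_out, /* model's response       */
                       void    *AMI_memory);        /* handle from AMI_Init() */

  Keeping one-time setup out of AMI_Impulse() is what makes repeated calls
  cheap; one simple implementation strategy (noted in the discussion below)
  is for AMI_Init() itself to do the allocation and then call AMI_Impulse()
  for everything else.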
- Arpad - Couldn't this have been done just by repeatedly calling Init()?
- MM - But Init() allocates memory, right?
- Todd - We could do it with Init() if you come up with a way to tell Init()
  not to do the memory allocation every time.
  - If we wanted to change Init() behavior, we could.
  - We are proposing just creating this new function.
- Arpad - Okay, I understand.
- John - You could attack it with an AMI parameter that tells Init() to
  behave differently, but good luck to people learning that spec.
- Todd - We had this debate of new function vs. changing legacy stuff.
- ML - This is actually an easy architecture to implement.
  - Implement an Init() function that does the allocation and then calls
    AMI_Impulse() to do everything else.
- MM - There are tools out there that already do this.
- Todd - [moving on]
  - Init() training followed by GetWave() training.
    - Each can loop until it's done and then move on.
  - The final step in the diagram shows an additional step we hypothesized:
    - Train in Init().
    - Refine in GetWave().
    - Go back and get a final refinement of the impulse response for a
      statistical simulation flow.
  - If we buy into all these flows, how do we handle flow control?
    - Tx and Rx read and write link training data.
    - Propose a state variable the simulator uses to tell the models what
      state they are operating under (a rough sketch follows below).
    - The Rx model returns a state value indicating:
      - keep training
      - stop training
      - abort
    - The Rx model really controls when a phase of training is complete.
  - In terms of .dlls:
    - Init() doesn't change.
    - New AMI_Impulse() function for repeated calls.
    - GetWave() - needs BIRD 128 for parameters_InOut.
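  [Hypothetical encoding, not from the presentation: the training-state
  variable was proposed only in concept, with no names or values defined.
  One conceivable form is an enumeration carried through the AMI parameter
  strings (the BIRD 128 parameters in/out mechanism for GetWave()); the
  parameter name "Training_State" and the values below are illustrative
  assumptions.]

      /* Possible training-state values exchanged between the tool and the
       * models.  The tool tells the models what state they are operating
       * under; the Rx returns a value saying whether to keep training,
       * stop, or abort. */
      typedef enum {
          TRAINING_OFF      = 0,  /* normal (mission mode) simulation       */
          TRAINING_CONTINUE = 1,  /* keep training - run another iteration  */
          TRAINING_DONE     = 2,  /* stop training - this phase is complete */
          TRAINING_ABORT    = 3   /* abort - link will not converge         */
      } training_state_t;

      /* Tool -> model (parameters in):     (Training_State 1)
       * Rx model -> tool (parameters out): (Training_State 2)              */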
- Todd - That's enough for today.
  - Does what I said make sense?
- MM - This makes sense.
  - Particularly for cases where you really want to simulate everything
    using impulse responses only.
  - There are plenty of cases where you never want to touch GetWave(), and
    this does it.
- Todd - Okay.
  - I just want to know if I made sense to everyone.
- Arpad - Trying to look at this from a distance compared to BIRD 147.
  - The only difference I can see is what you described earlier in the
    presentation.
  - BIRD 147 addresses simulating exactly what the physical device does.
  - This addresses everything else.
  - Is that a fair summary?
- Todd - They're largely similar.
  - BIRD 147 has had bits and pieces added over time to address some of
    this.
  - I don't think BIRD 147 is expressed that clearly.
  - Biggest differences in implementation:
    - A re-entrant entry point for Init() based processing.
    - Pull out the state variables in a more dedicated way.
- Walter - There will be another difference we haven't gotten to yet.
  - That is: how to describe the messages getting sent back and forth?
- ML - Could go a lot further, but we need a longer meeting for that.
  - Really important to go back to slides 6 through 8 and talk about the
    requirements first.
- Arpad - This hasn't yet touched on some of the BIRD 147 debates over the
  .bci file.
- Todd - We originally said we were going to talk about market
  requirements.
  - Probably up through about slide 10 today.
  - Wanted to give you a broader context about what's coming.
- Radek - One question comes to my mind.
  - Why do you insist on the optimization being done by the Rx?
  - There are general optimization tools available in many EDA platforms.
  - Design optimization is an established process, not just blind sweeps.
  - We don't need to overload the Rx with that responsibility.
  - I think it's a valid approach.
  - I would like to understand why we don't include it in this discussion.
- Todd - Really good point.
  - Scenario #1 - Literally emulating what the hardware does.
- Radek - Yes, but Scenario #2 is open for other applications.
- Walter - I think what you're talking about is the Tx Exploration
  Algorithm.
  - One thing is to put it into the Rx.
  - What you're saying is there might be some other external algorithm.
  - In that case the Rx would have to report an EQM as an output.
- Radek - I don't want to have a prolonged discussion at this moment.
  - I just wanted to mention it for thought.
- Todd - You make a good point.
  - I think optimizers are more prevalent in the traditional microwave
    space.
  - Not sure all tools in this space have them built in.
  - If we want to pursue it, then what info will the external optimizer
    need?
- Radek - Exactly.
- Arpad - The interface to these models may need options for this.
  - Parameters that go to the Tx may not care who did the optimization.
- Todd - Walter has been on this for a while.
  - Trying to characterize Tx models.
  - Come up with defined data to describe them.
  - We could then allow other algorithms to treat them as black boxes.
  - Walter hasn't gotten traction on that, but we've been on it.
- Arpad - Okay, thank you for the presentation.
  - Thank you everyone for all the good work this year.
  - Happy Holidays and Happy New Year.

-------------
Next meeting: 06 Jan 2015 12:00pm PT

-------------
IBIS Interconnect SPICE Wish List:

1) Simulator directives